Speaker clustering via the mean shift algorithm
Abstract
In this paper, we investigate the use of the mean shift algorithm for speaker clustering. The algorithm is an elegant nonparametric technique that has become very popular in image segmentation, video tracking, and other image processing and computer vision tasks. Its primary aim is to detect the modes of the underlying density and then merge the observations attracted by each mode. Since the number of modes need not be known beforehand, the algorithm seems well suited to the problem of speaker clustering. However, it needs to be adapted: the original algorithm acts on the space of observations, while speaker clustering algorithms act on the space of probabilistic parametric models. We attempt to adapt the algorithm using some basic concepts of information geometry that are related to the exponential family of distributions.
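The mode-seeking and merging steps described above can be illustrated with a minimal sketch of the original, observation-space mean shift. This is not the paper's adaptation to parametric models; the Gaussian kernel, the function names `mean_shift` and `merge_modes`, the parameters `bandwidth` and `merge_radius`, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np


def mean_shift(points, bandwidth, n_iter=100, tol=1e-6):
    """Move a copy of every point uphill on a Gaussian kernel density estimate."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Squared distances from each current position to every observation.
        diff = shifted[:, None, :] - points[None, :, :]
        sq_dist = np.sum(diff ** 2, axis=-1)
        weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
        # The update is the kernel-weighted mean of the observations.
        new_shifted = weights @ points / weights.sum(axis=1, keepdims=True)
        step = np.max(np.linalg.norm(new_shifted - shifted, axis=1))
        shifted = new_shifted
        if step < tol:
            break
    return shifted


def merge_modes(shifted, merge_radius):
    """Assign one label to all points whose final positions lie close together."""
    modes = []
    labels = np.empty(len(shifted), dtype=int)
    for i, position in enumerate(shifted):
        for k, mode in enumerate(modes):
            if np.linalg.norm(position - mode) < merge_radius:
                labels[i] = k
                break
        else:
            # No existing mode is close enough, so this position starts a new one.
            modes.append(position)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)


# Hypothetical data: two well-separated clusters; their number is not given to the algorithm.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2)),
])
labels, modes = merge_modes(mean_shift(data, bandwidth=1.0), merge_radius=0.5)
print(len(modes), "modes found")
```

The number of clusters is never passed in; it falls out of how many distinct modes the shifted points reach, which is the property the abstract highlights. The result does depend on the chosen `bandwidth`, so that parameter takes over the role that a fixed cluster count plays in other methods.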
Similar resources
The Approach of Mean Shift based Cosine Dissimilarity for Multi-Recording Speaker Clustering
Speaker clustering is an important task in many applications, such as speaker diarization and speech recognition. It can be done within a single multi-speaker recording (diarization) or across a set of different recordings. In this work we are interested in the former case and propose a simple iterative Mean Shift (MS) algorithm. The MS algorithm is based on Euclidean distance....
Mean shift algorithm for exponential families with applications to speaker clustering
This work extends the mean shift algorithm from the observation space to the manifolds of parametric models that are formed by exponential families. We show how the Kullback-Leibler divergence and its dual define the corresponding affine connection and propose a method for incorporating the uncertainty in estimating the parameters. Experiments are carried out for the problem of speaker clusteri...
Clustering short push-to-talk segments
We present a method for clustering short push-to-talk speech segments in the presence of different numbers of speakers. An iterative Mean Shift algorithm based on the cosine distance is used to perform speaker clustering on i-vectors generated from many short speech segments. We report results as measured by the Accuracy, the average number of detected speakers (ANDS), the average cluster purity (...
On the Use of PLDA i-vector Scoring for Clustering Short Segments
This paper extends previous work that used the Mean Shift algorithm to perform speaker clustering on i-vectors generated from short speech segments. In this paper we examine the effectiveness of probabilistic linear discriminant analysis (PLDA) scoring as the metric of the mean shift clustering algorithm in the presence of different numbers of speakers. Our proposed method, combined with k-neares...
Bilateral Weighted Fuzzy C-Means Clustering
The Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...